Abstract: We study the problem of optimal sequential ("as-you-go")
deployment of wireless relay nodes, as a person walks
along a line of random length (with a known distribution). The 
objective is to create an impromptu multihop wireless network 
for connecting a packet source to be placed at the end of the line 
with a sink node located at the starting point, to operate in the 
light traffic regime. In walking from the sink towards the source, 
at every step, measurements yield the transmit powers required 
to establish links to one or more previously placed nodes. Based 
on these measurements, at every step, a decision is made to place 
a relay node, the overall system objective being to minimize a 
linear combination of the expected sum power (or the expected 
maximum power) required to deliver a packet from the source to 
the sink node and the expected number of relay nodes deployed. 
For each of these two objectives, two different relay selection 
strategies are considered: (i) each relay communicates with the 
sink via its immediate previous relay, (ii) the communication 
path can skip some of the deployed relays. With appropriate 
modeling assumptions, we formulate each of these problems as a 
Markov decision process (MDP). We provide the optimal policy 
structures for all these cases, and provide illustrations of the 
policies and their performance, via numerical results, for some 
typical parameters. 

I. Introduction 

Wireless interconnection of resource-constrained mobile 
user devices or wireless sensors to the wireline infrastructure 
via relay nodes is an important requirement, since a direct 
one-hop link from the source node to the infrastructure "base- 
station" may not always be feasible, due to distance or poor 
channel condition. The relays could be battery operated radio 
routers or other wireless sensors in the wireless sensor network 
context, or other users' devices in the cellular context. The 
relays are also resource constrained and a cost might be 
involved in engaging or placing them. Hence, there arises the 
problem of optimal relay placement. 

Motivated by the above larger problem, we consider the
problem of "as-you-go" deployment of relay nodes along a
line, between a sink node and a source node (see Figure 1),
where the deployment operative starts from the sink node,
places relay nodes along the line, and places the source node
where the line ends. The problem is motivated by the need
for impromptu deployment of wireless networks by "first
responders," for situation monitoring in an emergency such
as a building fire or a terrorist siege. Such problems can
also arise when deploying wireless sensor networks in large
difficult terrains (such as forests) where it is difficult to plan
a deployment due to the unavailability of a precise map of
the terrain, or when such networks need to be deployed and
redeployed quickly and there is little time in between to plan,
or in situations where the deployment needs to be stealthy
(for example, when deploying sensor networks for detecting
poachers or fugitives in a forest).

Fig. 1. A line network with one source, one sink and three relays.

In this paper, we consider the problem of as-you-go de- 
ployment along a line of unknown random length, L, whose 
distribution is known. The transmit power required to establish 
a link (of a certain minimum quality) between any two nodes 
is modeled by a random variable capturing the effect of path- 
loss and shadowing. There is a cost for placing a relay, and 
the communication cost of a deployment is measured as some 
function of the powers required to communicate over the links. 
We consider two performance measures: the sum-power and 
the max-power along the path from the source to the sink, 
under two different relay selection strategies: (i) each relay
communicates with the sink via its immediate previous relay, 
(ii) the communication path can skip some of the deployed 
relays. Under certain assumptions on the distribution of L, 
and the powers required at the relays, we formulate each of the 
sequential placement problems as a total cost Markov decision 
process (MDP). 

The optimal policies for various MDPs formulated in this 
paper turn out to be threshold policies; the decision to place 
a relay at a given location involves the power required to 
establish a link to one or more previous nodes, and the distance 
to one or more previous nodes (depending on the objective and 
the relay selection strategy). Our analysis and numerical work
also suggest that allowing the possibility of skipping some of 
the deployed relays may result in a reduction in the total cost. 



A. Related Work 

There has been increasing interest in the research community
in exploring the impromptu relay placement problem in
recent years. Howard et al., in [1], provide heuristic algorithms
for incremental deployment of sensors with the objective of
covering the deployment area. Souryal et al., in [2], address the
problem of impromptu deployment of static wireless networks
with an extensive study of indoor RF link quality variation.
The work reported in [3] uses a similar approach for relay
deployment. Recently, Liu et al. ([4]) describe a breadcrumbs
system for aiding firefighters inside buildings. However, there
has been little effort to rigorously formulate the problem in
order to derive optimal policies, from which insights can
be gained, and which can be compared in performance to
reasonable heuristics. Recently, Sinha et al. ([5]) have provided
an MDP formulation for establishing a multi-hop network 
between a destination and an unknown source location by 
placing relay nodes along a random lattice path. They assume a 
given deterministic mapping between power and wireless link 
length, and, hence, do not consider the statistical variability 
(due to shadowing) of the transmit power required to maintain 
the link quality over links of a given length. We, however, 
consider such variability, and therefore bring in the idea of 
measurement based impromptu placement. 

B. Organization 

In Section II, the system model and the basic notation used
in this work are discussed.

In Section III, the problem of sequential relay placement
is addressed, under the assumption that a packet originating 
from the source makes a hop-by-hop traversal through all relay 
nodes. We formulate the problems with sum power and max- 
power objectives as MDPs and establish the optimal policy 
structures analytically. We show that, in each case, the
decision to place or not to place at the current position depends 
on a comparison of the transmit power for establishing a link 
from the current position, with a threshold that depends on the 
state of the Markov decision process. 

In Section IV we again address the same problems as in 
Section III, but we relax the restriction that the links of the
path from the source to the sink must be between adjacent 
deployed relays. This relaxation leads us to the formulation 
of MDPs with a more complicated state space. The optimal 
policies again turn out to be threshold policies. We provide 
numerical examples, using parameters similar to those that 
occur in commercially available wireless sensor networking 
devices. The performance improvement due to skipping relays 
is demonstrated.

We conclude in Section V.

II. System Model and Notation 
A. Length of the Line 

The length $L$ of the line is a priori unknown, but there
is prior information (e.g., the mean length $\bar{L}$) that leads us
to model $L$ as a geometrically distributed number of steps.
The step length $\delta$ (whose values will typically be several
meters, e.g., 2 meters in our numerical work) and the mean
length of the line, $\bar{L}$, can be used to obtain the parameter of
the geometric distribution, i.e., the probability $\theta$ that the line
ends at the next step. In the problem formulation, we assume
$\delta = 1$ for simplicity.$^{1}$ All distances are assumed to be integer
multiples of $\delta$.
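
As a minimal sketch of how the geometric parameter follows from the prior information, the snippet below assumes the length takes values 1, 2, ... steps with mean equal to the mean length divided by the step length. The function names `geometric_parameter` and `sample_line_length` are illustrative and do not come from the paper.

```python
import random


def geometric_parameter(mean_length_m: float, step_m: float) -> float:
    """Probability that the line ends at the next step.

    Assumes a geometric number of steps on {1, 2, ...}, whose mean is
    1 / theta, so theta = step length / mean length.
    """
    mean_steps = mean_length_m / step_m
    return 1.0 / mean_steps


def sample_line_length(theta: float, rng: random.Random) -> int:
    """Draw the number of steps walked before the line ends."""
    steps = 1
    while rng.random() >= theta:  # with probability 1 - theta the line continues
        steps += 1
    return steps


# Values used later in the numerical example: 2 m steps, 80 m mean length.
theta = geometric_parameter(mean_length_m=80.0, step_m=2.0)
```

With these inputs the mean is 40 steps, which matches the value of the parameter quoted in the numerical example of Section III.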

B. Deployment Process and Some Notation 

As the person walks along the line, at each step he measures 
the link quality from the current location to one or more than 
one previous node and accordingly decides whether to place 
a relay at the current location or not. After the deployment 
process is complete (at the end of the line where the source 
is placed), we denote the number of deployed relays by N, 
which is a random number, with the randomness coming from 
the randomness in the link qualities and in the length of the 
line. As shown in Figure 1, the sink is called Node 0, the relay
closest to the sink is called Node 1, and finally the source is
called Node $(N + 1)$. The link whose transmitter is Node $i$
and receiver is Node $j$ is called link $(i, j)$. A generic link is
denoted by $e$.

C. Traffic Model 

We consider a traffic model where the traffic is so low 
that there is only one packet in the network at a time; we 
call this the "lone packet model." As a consequence of this 
assumption, (i) the transmit power required over each link 
depends only on the propagation loss over that link, as there 
are no simultaneous transmissions to cause interference, and 
(ii) the transmission delay over each link is easily calculated, 
as there are no simultaneous transmitters to contend with. This 
permits us to easily write down the communication cost on a 
path over the deployed relays. 

It was shown in [6] that a network operating under
CSMA/CA medium access, and designed for carrying any 
positive traffic, with some QoS (in terms of the packet delivery 
probability), must necessarily be able to provide the desired 
QoS to lone packet traffic. As-you-go deployment of wireless 
networks while meeting QoS objectives for specific positive 
packet arrival rates is a topic of our ongoing research. 

D. Channel Model 



For our network performance objective (see Section II-E),
we need the transmit power required to sustain a certain quality
of communication over a link. In order to model this required
power, we consider the usual aspects of path-loss, shadowing,
and fading. A link is considered to be in outage if the received
signal strength drops below $P_{\text{rcv-min}}$ (due to fading) (e.g.,
below -88 dBm). The transmit power that we use is such that
the probability of outage is less than a small value (say 5%).
For a generic link of length $r$, we denote by $\Gamma_r$ the transmit
power required; due to shadowing, this is modeled as a random
variable. Since practical radios can only be set to transmit
at a finite set of power levels, the random variable $\Gamma_r$ takes
values from a discrete set, $\mathcal{S}$. The distribution function of $\Gamma_r$ is
denoted by $G_r(\cdot)$, and the probability mass function (p.m.f.) by
$g(r, \cdot)$, i.e., $g(r, \gamma) := P(\Gamma_r = \gamma)$ for all $\gamma \in \mathcal{S}$; $g(r, \gamma)$ is the
probability that at least the transmit power level $\gamma$ is required
to establish a link of length $r$. Since the required transmit
power increases with distance, we assume that $\{G_r\}_{r=1,2,\ldots}$
is a sequence of distributions stochastically increasing (for
definition, see [7]) in $r$. We also need to talk about a specific
link, say $e$; we will denote the transmit power required for this
link by $\Gamma^{(e)}$. We assume that the powers required to establish
any two different links in the network are independent, i.e.,
$\Gamma^{(e_1)}$ is independent of $\Gamma^{(e_2)}$ for $e_1 \neq e_2$. Spatially correlated
shadowing will be considered in our future work.

1 The geometric distribution is the maximum entropy discrete probability
mass function with a given mean. Thus, by using the geometric distribution,
we are leaving the length of the line as uncertain as we can, given the prior
knowledge of $\bar{L}$.
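
The stochastic-ordering assumption on the power distributions can be checked numerically for any candidate family of p.m.f.s. The sketch below uses the usual first-order definition (the distribution function at every power level does not increase as the link gets longer); the helper names and the toy p.m.f.s are illustrative assumptions, not data from the paper.

```python
from itertools import accumulate


def cdf(pmf):
    """Cumulative sums of a p.m.f. listed over increasing power levels."""
    return list(accumulate(pmf))


def is_stochastically_increasing(pmfs, tol=1e-12):
    """True if G_{r+1}(x) <= G_r(x) at every power level x, for every r.

    pmfs[k] is the p.m.f. for link length k + 1, over a common set of levels.
    """
    cdfs = [cdf(p) for p in pmfs]
    return all(
        nxt[i] <= cur[i] + tol
        for cur, nxt in zip(cdfs, cdfs[1:])
        for i in range(len(cur))
    )


# Toy family over three power levels: mass shifts to higher levels as r grows.
toy_pmfs = [
    [0.7, 0.2, 0.1],  # r = 1
    [0.5, 0.3, 0.2],  # r = 2
    [0.2, 0.3, 0.5],  # r = 3
]
```

Calling `is_stochastically_increasing(toy_pmfs)` accepts this family, while the same family in reverse order is rejected.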

E. Deployment Objective 

In this paper, we do not consider the possibility of another 
person following behind, who can learn from the measure- 
ments and actions of the first person, thereby supplementing 
the actions of the preceding individual. Our objective is to 
design relay placement policies so as to minimize the sum of 
the expected sum power/maximum power (to deliver a packet
from the source node to the sink node) and the expected cost
of placing the relays (the expected number of relays multiplied
by the relay cost, $\xi$). By a standard constraint relaxation,
this problem also arises from the problem of minimizing the
expected sum/max power, subject to a constraint on the mean
number of relays. Such a constraint can be justified if we
consider relay deployment for multiple source-sink pairs over
several different lines of mean length $\bar{L}$, given a large pool of
relays, and we are only interested in keeping small the total 
number of relays over all these deployments. 

The max-power objective is a valid one in a typical sensor 
network setting since each of the battery-operated relays must 
use as little power as possible in order to maximize the 
network lifetime. The sum-power objective may be useful in 
a different scenario. Consider a mobile station (MS) trying to 
establish a multihop connection to a base station (BS) at an 
unknown distance in order to download data, and there is a 
continuum of other nodes between them. Each node can be 
used as a relay only if it is paid a certain price. The price 
may have a fixed component (corresponding to the cost $\xi$ of
a relay) and a variable component proportional to the power 
used by the relay to serve the MS. The MS could send out a 
probe towards the BS; the probe needs to sequentially establish 
a multihop path using other nodes along the path as relays. 
The formulation for this problem will be analogous to that for 
the sequential relay placement problem with the sum-power 
objective. Also, in the context of global energy saving, it is 
interesting to have relay deployment policies that minimize 
the sum power. 

Note that our problem formulation is applicable to the
situation in which a relay can be set to a low power state
except when it has to receive or transmit. If the relays always
keep their receivers on, with the current drawn from the battery
in the receiving mode being the same as the current required
in the transmit mode, then the battery lifetime will depend on
the current drawn in the receive mode, since for light traffic
the node will transmit rarely. Also note that our formulation
is capable of using any monotonically increasing function of
the power at a node as the objective to be optimized, rather
than directly using the sum/max power objective. The function
could denote the current requirement for a particular transmit
power, which will in turn govern the lifetime.

F. Routing over the Deployed Relays 

After node deployment, the routes could be constrained so
as to allow transmissions only between adjacent nodes, i.e.,
the routes use solely the links represented by the solid lines in
Figure 1; we consider this problem in Section III. However,
after deployment, it may turn out that it is better that the
route from the source to the sink skips some relays (e.g., in
Figure 1, if the channel between the source node and relay 2
is very good, it could be better to directly transmit from the
source node to relay 2 without using relay 3). Hence, while
formulating the problem, it would be beneficial to permit the
possibility that some of the dotted links in Figure 1 can be
used after deployment; this problem is solved in Section IV.



III. Relaying via Adjacent Previous Node Only 

In this section we allow relaying from the source to the 
sink only by each relay passing the packet to the immediate 
previous relay, in the order of deployment. Thus, this is the 
measurement-based extension to the problem in [8].

A. Sum-Power Objective 

1) Problem Formulation: Our problem is to place the relay
nodes sequentially such that the expected sum of the total
power cost and the relay cost is minimized. We formulate this
problem as an MDP with state $(r, \gamma)$, where $r$ is the distance
of the current location from the previous node and $\gamma$ is
the transmit power required to establish a link to the previous
node from the current location. Based on $(r, \gamma)$, a decision is
made whether to place a relay at the current position or not.
Here $\mathbf{0}$ denotes the state at the beginning of the process (at the sink
node). When the source is placed, the process terminates and
the system enters and stays forever at a state $\mathbf{e}$. The action
space is {place, do not place}. The randomness comes from
the random length $L$ and the randomness in $\Gamma^{(e)}$.

The problem we seek to solve is:

$$
\min_{\pi \in \Pi} E_\pi \left( \sum_{i=1}^{N+1} \Gamma^{(i,i-1)} + \xi N \right) \tag{1}
$$


where $\Pi$ is the set of all stationary deterministic Markov
placement policies and $\pi$ is a specific stationary deterministic
Markov placement policy. Any deterministic Markov policy
$\pi$ is a sequence of mappings $\{\mu_k\}_{k \geq 1}$, where $\mu_k$ takes the
state of the system at time $k$ (the $k$-th step from the sink, in
the context of our problem) and maps it into any one of the
two actions {place, do not place}. If $\mu_k$ does not depend on $k$,
then the policy is called a stationary policy. By Proposition 1.1.1
of [9], we can restrict ourselves to the class of randomized
Markov policies. The justification for restriction to stationary
deterministic policies will be given later.
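
To make the objective in (1) concrete, the following sketch simulates one deployment under a stationary deterministic policy and returns the realized sum power plus relay cost. It is only an illustration of the cost structure: the function name, the `policy(r, gamma)` and `sample_power(r, rng)` callables, and the order in which the measurement and the end-of-line event are drawn are all assumptions made here for simplicity.

```python
import random


def simulate_sum_power(policy, theta, sample_power, xi, rng):
    """Simulate one as-you-go deployment with hop-by-hop relaying.

    policy(r, gamma) -> True to place a relay (stationary: depends on the state only).
    sample_power(r, rng) -> power needed for a link of length r (hypothetical sampler).
    Returns (sum power + xi * number of relays, number of relays).
    """
    total_power, relays, r = 0.0, 0, 0
    while True:
        r += 1                        # walk one more step from the previous node
        gamma = sample_power(r, rng)  # measured power to the previous node
        if rng.random() < theta:      # the line ends here: the source is placed
            return total_power + gamma + xi * relays, relays
        if policy(r, gamma):          # place a relay; the process regenerates
            total_power += gamma
            relays += 1
            r = 0
```

Averaging the first returned value over many runs gives a Monte Carlo estimate of the expected cost in (1) for the chosen policy.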

Solving (1) also helps in solving the following constrained
problem (see [10]):

$$
\min_{\pi \in \Pi} E_\pi \left( \sum_{i=1}^{N+1} \Gamma^{(i,i-1)} \right) \quad \text{such that } E_\pi N \leq M, \tag{2}
$$

where $M$ is a constraint on the mean number of relays
deployed. In this paper, however, we consider only the
unconstrained problem.

If the state is $(r, \gamma)$ and a relay is placed, the relay cost $\xi$
and the power cost $\gamma$ are incurred at that step. We do not count
the price of the source node, but include the power used by
the source in our cost. No cost is incurred if we do not place a
relay at a certain location. Note that $\mathbf{0}$ also denotes the state
immediately after placing a relay, since the process regenerates
whenever a relay is placed (this follows from the memoryless
property of the geometric distribution and the independence of
$\Gamma^{(i,j)}$ and $\Gamma^{(k,l)}$ for $(i,j) \neq (k,l)$). Let us define $J_\xi(r, \gamma)$ and
$J_\xi(\mathbf{0})$ to be the optimal expected cost-to-go starting from state
$(r, \gamma)$ and state $\mathbf{0}$, respectively.

2) Bellman Equation: Here we have an infinite horizon
total cost MDP with a countable state space and finite action
space. The assumption P of Chapter 3 in [9] is satisfied
here, since the single-stage costs are nonnegative. Hence, by
Proposition 3.1.1 of [9], the optimal value function $J_\xi(\cdot)$
satisfies the following Bellman equation:

$$
\begin{aligned}
J_\xi(r, \gamma) &= \min \Big\{ \underbrace{\xi + \gamma + J_\xi(\mathbf{0})}_{c_p},\ \underbrace{\theta E(\Gamma_{r+1}) + (1-\theta) E J_\xi(r+1, \Gamma_{r+1})}_{c_{np}} \Big\} \\
J_\xi(\mathbf{0}) &= \theta E(\Gamma_1) + (1-\theta) E J_\xi(1, \Gamma_1)
\end{aligned} \tag{3}
$$

$c_p$ in (3) is the cost of placing a relay at the state $(r, \gamma)$, and
$c_{np}$ is the cost of not placing a relay.

If the current state is $(r, \gamma)$ and the line has not ended yet,
we can take either of the two actions. If we place a relay, a
cost $(\xi + \gamma)$ is incurred; another cost $J_\xi(\mathbf{0})$ is also incurred
since the decision process regenerates at that point. If we do
not place a relay, the line will end with probability $\theta$ in the
next step, in which case a cost $E(\Gamma_{r+1})$ will be incurred. If
the line does not end in the next step, the next state will be
$(r+1, \gamma')$ where $\gamma' \sim G_{r+1}$, and a mean cost of
$E J_\xi(r+1, \Gamma_{r+1}) = \sum_\gamma g(r+1, \gamma) J_\xi(r+1, \gamma)$ will be incurred. Note
that it is never optimal to place a relay at state $\mathbf{0}$. If it were
so, then we would have placed infinitely many relays at the
sink, leading to infinite relay cost. But if we place one relay
at each step until the line ends, the expected cost will be less
than $(\frac{1}{\theta} + 1)(\xi + E(\Gamma_1)) < \infty$. Hence, the optimal action at
state $\mathbf{0}$ would be to move to the next step without placing
the relay. In the next step the line ends with probability $\theta$, in
which case a cost $E(\Gamma_1)$ is incurred. If the line does not end
in the next step, the next state will be $(1, \gamma)$ where $\gamma \sim G_1$.

Justification for restricting to the class of stationary,
deterministic policies: From (3), we see that for each state, there is
one action from the set of actions that achieves the minimum
in the Bellman equation. Hence, by Proposition 3.1.3 of [9],
we have a stationary deterministic optimal policy. Hence, we
can focus on the class of stationary, deterministic policies.

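3) Value Iteration: The value iteration for (1) is given by:

$$
\begin{aligned}
J_\xi^{(k+1)}(\mathbf{0}) &= \theta E(\Gamma_1) + (1-\theta) E J_\xi^{(k)}(1, \Gamma_1) \\
J_\xi^{(k+1)}(r, \gamma) &= \min \Big\{ \xi + \gamma + J_\xi^{(k)}(\mathbf{0}),\ \theta E(\Gamma_{r+1}) + (1-\theta) E J_\xi^{(k)}(r+1, \Gamma_{r+1}) \Big\}
\end{aligned} \tag{4}
$$

where $J_\xi^{(0)}(r, \gamma) = 0$ for all $r$, $\gamma$ and $J_\xi^{(0)}(\mathbf{0}) = 0$.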

Lemma 1: The value iteration (4) provides a nondecreasing
sequence of iterates that converges to the optimal value
function, i.e., $J_\xi^{(k)}(r, \gamma) \uparrow J_\xi(r, \gamma)$ for all $r$, $\gamma$, and $J_\xi^{(k)}(\mathbf{0}) \uparrow J_\xi(\mathbf{0})$
as $k \uparrow \infty$.

Proof: See Appendix A. ■

4) Policy Structure: 

Lemma 2: $J_\xi(r, \gamma)$ is concave, increasing in $\gamma$ and $\xi$, and
also increasing in $r$. $J_\xi(\mathbf{0})$ is concave, increasing in $\xi$.

Proof: See Appendix A. ■

Theorem 1: Policy Structure: The optimal policy for
Problem (1) is a threshold policy with a threshold $\gamma_{th}(r)$ increasing
in $r$ such that at a state $(r, \gamma)$ it is optimal to place a relay
if and only if $\gamma \leq \gamma_{th}(r)$. This corresponds to the condition
$c_p \leq c_{np}$.

Proof: See Appendix A. ■

Remark 1: If $\gamma = \gamma_{th}(r)$, either action is optimal.
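
As a worked step (a sketch read off the Bellman equation (3), not a substitute for the proof in the appendix), the placement condition can be rearranged so that the power $\gamma$ is isolated on the left. The right-hand side then depends only on $r$, which suggests a candidate expression for the threshold; treating it as the threshold itself is an assumption of this sketch.

```latex
\begin{aligned}
c_p \le c_{np}
&\iff \xi + \gamma + J_\xi(\mathbf{0}) \le \theta E(\Gamma_{r+1}) + (1-\theta) E J_\xi(r+1, \Gamma_{r+1}) \\
&\iff \gamma \le \underbrace{\theta E(\Gamma_{r+1}) + (1-\theta) E J_\xi(r+1, \Gamma_{r+1}) - \xi - J_\xi(\mathbf{0})}_{\text{candidate for } \gamma_{th}(r)}
\end{aligned}
```

Since the powers take values in a discrete set, any value between two adjacent power levels induces the same decisions, so the threshold is not unique as a real number.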

Discussion of the Policy Structure: We do not place at an $r$ if
the required power is too high, as one might expect to get a
better channel if one takes another step. For each $r$, there is a
threshold on $\gamma$ below which we place. This threshold increases
with $r$ since the distribution of $\Gamma_r$ is stochastically increasing
with $r$.

Note that the optimal policy in Theorem 1 can also be stated
as follows: place a relay if and only if $r \geq r_{th}(\gamma)$ (i.e., $c_p \leq c_{np}$)
where $r_{th}(\gamma)$ is some threshold on $r$, increasing in $\gamma$.
Moreover, if there is a function $d(\cdot)$ such that $g(r, d(r)) = 1$
for all $r$, $d(r)$ is convex increasing in $r$, $d(r) \uparrow \infty$ and $d'(r) \uparrow \infty$
as $r \uparrow \infty$, then we have the same problem as [8], in which
case it is optimal to place if and only if $r \geq r_{th}$.

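5) Computation of the Optimal Policy: Let us write
$V_\xi(r) := E J_\xi(r, \Gamma_r)$, i.e., $V_\xi(r) := \sum_\gamma g(r, \gamma) J_\xi(r, \gamma)$ for
all $r \in \{1, 2, 3, \ldots\}$, and $V_\xi(\mathbf{0}) := J_\xi(\mathbf{0})$. Also, for each
stage $k \geq 0$ of the value iteration define
$V_\xi^{(k)}(r) := E J_\xi^{(k)}(r, \Gamma_r)$ and $V_\xi^{(k)}(\mathbf{0}) := J_\xi^{(k)}(\mathbf{0})$.

Observe that from the value iteration (4), we obtain:

$$
\begin{aligned}
V_\xi^{(k+1)}(r) &= \sum_\gamma g(r, \gamma) \min \Big\{ \xi + \gamma + V_\xi^{(k)}(\mathbf{0}),\ \theta E(\Gamma_{r+1}) + (1-\theta) V_\xi^{(k)}(r+1) \Big\} \\
V_\xi^{(k+1)}(\mathbf{0}) &= \theta E(\Gamma_1) + (1-\theta) V_\xi^{(k)}(1)
\end{aligned} \tag{5}
$$

Since $J_\xi^{(k)}(r, \gamma) \uparrow J_\xi(r, \gamma)$ for each $r$, $\gamma$ and $J_\xi^{(k)}(\mathbf{0}) \uparrow J_\xi(\mathbf{0})$
as $k \uparrow \infty$, we can argue that $V_\xi^{(k)}(r) \uparrow E J_\xi(r, \Gamma_r)$ for all
$r \in \{1, 2, 3, \ldots\}$ (by the Monotone Convergence Theorem) and
$V_\xi^{(k)}(\mathbf{0}) \uparrow J_\xi(\mathbf{0})$. Thus, $V_\xi^{(k)}(r) \uparrow V_\xi(r)$ and $V_\xi^{(k)}(\mathbf{0}) \uparrow V_\xi(\mathbf{0})$.
Hence, by the function iteration (5), we obtain $V_\xi(\mathbf{0})$ and $V_\xi(r)$
for all $r \geq 1$. Then, from (3), we can compute $\gamma_{th}(r)$. Thus, for
this iteration, we need not keep track of the cost-to-go values
$J_\xi^{(k)}(r, \gamma)$ at each stage $k$; we simply need to keep track of
the iterates $V_\xi^{(k)}(\cdot)$.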
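
The function iteration (5) can be sketched in a few lines once the state space is made finite. The version below assumes that a relay must be placed when the distance reaches `r_max`, so that not placing has infinite cost there; that truncation, the function name and the data layout (`pmf[r][i]` for the probability of the i-th level at distance `r`) are choices made for this illustration rather than the authors' implementation.

```python
def value_iteration_sum_power(levels, pmf, theta, xi, r_max, tol=1e-12, max_iters=100_000):
    """Iterate (5) on V(r) = E J(r, Gamma_r); V[0] stands for the state 0.

    levels: power levels in mW, in increasing order.
    pmf[r][i]: probability that a link of length r needs levels[i], r = 1..r_max.
    Returns (V, gamma_th) with gamma_th[r] = c_np(r) - xi - V[0].
    """
    mean = {r: sum(p * g for p, g in zip(pmf[r], levels)) for r in range(1, r_max + 1)}
    V = {r: 0.0 for r in range(r_max + 1)}  # start from the all-zero function

    def cost_not_place(r, values):
        if r == r_max:
            return float("inf")  # assumed truncation: placement is forced here
        return theta * mean[r + 1] + (1 - theta) * values[r + 1]

    for _ in range(max_iters):
        new = {0: theta * mean[1] + (1 - theta) * V[1]}
        for r in range(1, r_max + 1):
            c_np = cost_not_place(r, V)
            new[r] = sum(p * min(xi + g + V[0], c_np) for p, g in zip(pmf[r], levels))
        diff = max(abs(new[r] - V[r]) for r in V)
        V = new
        if diff < tol:
            break

    gamma_th = {r: cost_not_place(r, V) - xi - V[0] for r in range(1, r_max + 1)}
    return V, gamma_th


# Degenerate check: one power level (1 mW) and forced placement at every step.
V, gamma_th = value_iteration_sum_power([1.0], {1: [1.0]}, theta=0.5, xi=0.5, r_max=1)
```

In the degenerate check the line has two steps on average, so one relay and two unit-power links are expected, giving a cost of 2.5 at state 0; the iteration reproduces this value.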



6) A Numerical Example: We take $\delta = 2$ meters and
$\theta = 0.025$ (i.e., $\bar{L} = 40$ steps, or 80 meters), and
$\mathcal{S} = \{-25, -20, -15, -10, -5, 0, 3\}$ in dBm. Using a standard
model, with transmit power $P_T$ (mW), the received power
(in mW) at a distance $r$ from the transmitter is given by
$P_T\, c\, (r/r_0)^{-\eta} H\, 10^{-\nu/10}$, where $c$ is a constant and $r_0$ is a
reference distance. $H$ models Rayleigh fading, and is exponentially
distributed with mean 1. $\nu$ is assumed to be distributed as
$\mathcal{N}(0, \sigma^2)$ with $\sigma = 8$ dB; i.e., we have log-normal shadowing.
The shadowing is assumed to be independent from link to
link. For a commercial implementation of the PHY/MAC of
IEEE 802.15.4 (a popular wireless sensor networking
standard), -88 dBm received power corresponds to a 2% packet
loss probability for 140 byte packets. Taking -88 dBm to
be the minimum acceptable received power, we set the target
received power (averaged over fading) to be $\psi = 10^{-7.5}$ mW
(i.e., -75 dBm), which (under Rayleigh fading) yields an
"outage" probability of 5%. Hence, the transmit power required to
establish the link is:

$$
P_{req} = \frac{\psi}{c} \left( \frac{r}{r_0} \right)^{\eta} 10^{\nu/10} \tag{8}
$$

Now, since the set $\mathcal{S}$ is bounded, at large distance the
required power to achieve $\psi$ will exceed 2 mW (i.e., 3 dBm)
with high probability. To tackle this problem, we take the
following approach. We modify the problem formulation by
requiring that a relay is placed when $r = 10$ (in steps).
For $1 \leq r \leq 10$ (in steps), we obtain $\Gamma_r$ by inverting the
path-loss formula and using the next higher power level in
$\mathcal{S}$. For example, if the required transmit power to establish
the link, obtained from (8), is in $(-5, 0]$ dBm, then we say
that the power requirement is 0 dBm for that link. But, if the
power requirement is more than 3 dBm, then we say that the
required power is 3 dBm. It is easy to see that the distribution
$G_r(\cdot)$ of $\Gamma_r$ obtained in this fashion is stochastically increasing$^{2}$
in $r$. For the numerical work, we assume that $\eta = 2.5$,
$c\, r_0^{\eta} = 10^{-3}$ m$^{2.5}$ (i.e., -30 dB). With these parameters, the
probability of the transmit power required to achieve a target
received power $\psi = -75$ dBm (averaged over fading) for a
link of length 10 steps (20 meters) exceeding 3 dBm (which
is the maximum transmit power) will be less than 2.65%. This
probability will be even less for links having length smaller
than 20 meters. Note that these comments imply that there is
a positive probability of deployment failure; we will quantify
this later in Section III-A7.

TABLE I
Relaying via the last placed relay: break-up of the optimal cost for the
example in Section III-A for three values of the relay cost $\xi$.

Fig. 2. Relaying only via last placed relay: $\gamma_{th}(r)$ vs. $r$, for various relay
costs, $\xi$, for the numerical example described in Section III-A6.
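
The quoted probabilities can be reproduced approximately with the standard library alone. The sketch below assumes the reading $c\, r_0^{\eta} = 10^{-3}$ (with distances in meters) for the partly illegible path-loss constant, uses the Gaussian tail function for the log-normal shadowing, and rounds the required power up to the next level in the set of power levels, with the top level absorbing the tail. All names are illustrative.

```python
import math

PSI_DBM = -75.0       # target mean received power
RCV_MIN_DBM = -88.0   # minimum acceptable received power
SIGMA_DB = 8.0        # shadowing standard deviation
ETA = 2.5             # path-loss exponent
C_DB = -30.0          # c * r0**eta in dB, distances in meters (assumed reading)
LEVELS_DBM = [-25, -20, -15, -10, -5, 0, 3]


def q_function(x):
    """Tail probability of a standard normal random variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def outage_probability():
    """P(H < ratio) for H ~ Exp(1), ratio = minimum / mean received power."""
    ratio = 10 ** ((RCV_MIN_DBM - PSI_DBM) / 10.0)
    return 1.0 - math.exp(-ratio)


def median_required_dbm(distance_m):
    """Required transmit power in dBm when the shadowing term is zero."""
    return PSI_DBM - C_DB + 10.0 * ETA * math.log10(distance_m)


def prob_exceeds(level_dbm, distance_m):
    """Probability that the required power is above the given level."""
    return q_function((level_dbm - median_required_dbm(distance_m)) / SIGMA_DB)


def pmf(distance_m):
    """g(r, .) after rounding up to the next level; the top level absorbs the tail."""
    probs, prev_cdf = [], 0.0
    for level in LEVELS_DBM[:-1]:
        cdf = 1.0 - prob_exceeds(level, distance_m)
        probs.append(cdf - prev_cdf)
        prev_cdf = cdf
    probs.append(1.0 - prev_cdf)
    return probs
```

Under these assumptions `outage_probability()` comes out close to the 5% quoted in the text, and `prob_exceeds(3, 20.0)` close to the 2.65% quoted for a 10-step link.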

Figure 2 shows the variation of $\gamma_{th}(r)$ with $r$ and $\xi$. Here the
unit of $\xi$ is mW/relay, so that the unit of $J_\xi(\mathbf{0})$ is also mW.
We see from Figure 2 that, for $\xi = 0.001$, if the previous
relay is 8 meters behind, and the required power is -20 dBm
or below, then a relay should be placed at this point because
$\gamma_{th}(r = 8\ \text{meters}) = -20$ dBm. Also, note that -25 dBm is
the smallest possible transmit power level, and for $\xi = 0.001$
we will never place a relay at $r = 2$ meters since there $\gamma_{th}(r)$
is 0 mW ($-\infty$ dBm). The variation of $\gamma_{th}(r)$ with $r$ has already
been established in Theorem 1. But Figure 2 also shows that
$\gamma_{th}(r)$ is decreasing in $\xi$ for each $r$. This is intuitive because
$\xi$ is the price of a relay, and all it says is that as the price of
a relay increases, we will place the relays less frequently.
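
Because the thresholds are quoted in dBm while the costs are in mW, a small conversion helper is handy when reading the figures. The two functions below are a generic sketch (names chosen here), using the usual definition of dBm as ten times the base-10 logarithm of the power in mW.

```python
import math


def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level from dBm to mW."""
    return 10 ** (p_dbm / 10.0)


def mw_to_dbm(p_mw: float) -> float:
    """Convert a power from mW to dBm; zero power maps to minus infinity."""
    if p_mw == 0:
        return -math.inf
    return 10.0 * math.log10(p_mw)
```

For instance, 3 dBm is just under 2 mW and -20 dBm is 0.01 mW, while a threshold of 0 mW corresponds to minus infinity in dBm, as noted in the text.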

The variation of the mean number of relays $E(N)$ and
various cost components with $\xi$ is shown in Table I. It shows
that as the cost of a relay $\xi$ increases, the mean number of
relays decreases, and the power cost and $J_\xi(\mathbf{0})$ increase.



2 The results asserted for the formulation in Section III-A, e.g., threshold
structure of the optimal policy, hold for this variation as well. The only change
will be that at $r = 10$ steps the only feasible action is to place the relay. We
can use the same iterations as (4) and (5) to compute the optimal policy in
this case, except that we need to set $J_\xi^{(k)}(r, \gamma)$ or $V_\xi^{(k)}(r)$ to $\infty$ at each
iteration if $r$ is 11 steps or more. This will force the policy to place the relay
within every 10 steps.



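$$
J_\xi(r, \gamma, \gamma_{max}) = \min \Big\{ \xi + \theta E \max\{\gamma, \gamma_{max}, \Gamma_1\} + (1-\theta) E J_\xi(1, \Gamma_1, \max\{\gamma, \gamma_{max}\}),\ \theta E \max\{\gamma_{max}, \Gamma_{r+1}\} + (1-\theta) E J_\xi(r+1, \Gamma_{r+1}, \gamma_{max}) \Big\} \tag{6}
$$

$$
J_\xi^{(k+1)}(r, \gamma, \gamma_{max}) = \min \Big\{ \xi + \theta E \max\{\gamma, \gamma_{max}, \Gamma_1\} + (1-\theta) E J_\xi^{(k)}(1, \Gamma_1, \max\{\gamma, \gamma_{max}\}),\ \theta E \max\{\gamma_{max}, \Gamma_{r+1}\} + (1-\theta) E J_\xi^{(k)}(r+1, \Gamma_{r+1}, \gamma_{max}) \Big\} \tag{7}
$$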



7) Deployment Failure: Since, in practice, there is a maximum
power at which a transmitter can transmit (e.g., 3 dBm),
there is a possibility that a deployment can fail. It is interesting
to compute the probability of such failure in the algorithms
that we have derived. Here we provide a simulation estimate
of the deployment failure probabilities for the sum power
objective, under the threshold policies obtained numerically in
Section III-A6. In our numerical example in Section III-A6,
deployment failure can occur in the following two cases: (i)
at $r = 10$ steps the required power exceeds 3 dBm; the
probability of this event is 2.65%, and (ii) the source at
the end of the line requires more than 3 dBm power. By
simulating 200000 deployments, we observe that the deployment
failure probabilities for $\xi = 0.001$, 0.01, 0.1 and 1 are
0.025%, 0.057%, 0.555% and 3.8%, respectively.

Evidently, there is a trade-off between the target link 
performance and the probability of link failure. In addition, 
by placing relays more frequently, we reduce the chance of 
being caught in a situation where the deployment operative has 
walked too far without placing a relay and is unable to get a 
workable link to the previous node. In future work, we propose 
to include deployment failure probability as a constraint in the 
optimization formulation. Another way to reduce deployment 
failure is to permit back-tracking by the deployment operative, 
which, of course, will require the placement algorithm to keep 
more measurement history; we propose to permit this in our 
future work as well. 



B. Max-Power Objective 

1) Problem Formulation: We aim to address the following
problem:

$$
\min_{\pi \in \Pi} E_\pi \left( \max_{i \in \{1, 2, \ldots, N+1\}} \Gamma^{(i,i-1)} + \xi N \right) \tag{9}
$$



We formulate (9) as an MDP. The state of the system is
$(r, \gamma, \gamma_{max})$ where $r$ and $\gamma$ are the same as before, and $\gamma_{max}$
is the maximum power used in all the previously established
links. The action space is {place, do not place} as before. The
cost structure is such that the power cost is incurred only after
the source node is placed.

2) Bellman Equation: The problem is again an infinite
horizon total cost problem with countable state space, finite
action space and nonnegative single-stage cost. Hence, by
the same arguments as used in problem (1), the optimal
value function $J_\xi(\cdot)$ satisfies the Bellman equation (6). At
state $\mathbf{0}$, it is not optimal to place a relay. Hence,
$J_\xi(\mathbf{0}) = \theta E(\Gamma_1) + (1-\theta) E J_\xi(1, \Gamma_1, 0)$.

At state $(r, \gamma, \gamma_{max})$, if we place a relay, we incur a cost
$\xi$ and in the next step the line ends with probability $\theta$, in
which case a power cost of $E \max\{\gamma, \gamma_{max}, \Gamma_1\}$ is incurred.
If the line does not end in the next step, the next state
becomes $(1, \gamma', \max\{\gamma, \gamma_{max}\})$ where $\gamma' \sim G_1$, and a cost
of $E J_\xi(1, \Gamma_1, \max\{\gamma, \gamma_{max}\})$ is incurred. On the other hand,
if we do not place a relay at state $(r, \gamma, \gamma_{max})$, the line ends
in the next step with probability $\theta$, in which case a power cost
of $E \max\{\gamma_{max}, \Gamma_{r+1}\}$ is incurred. If the line does not end in
the next step, the next state will be $(r+1, \gamma', \gamma_{max})$ where
$\gamma' \sim G_{r+1}$.

3) Value Iteration: The value iteration for this MDP is
given by (7) with $J_\xi^{(0)}(r, \gamma, \gamma_{max}) = 0$ for all $r$, $\gamma$, $\gamma_{max}$.

Lemma 3: The iterates of the value iteration (7) converge
to the optimal value function, i.e.,
$J_\xi^{(k)}(r, \gamma, \gamma_{max}) \uparrow J_\xi(r, \gamma, \gamma_{max})$ for all $(r, \gamma, \gamma_{max})$, as $k \uparrow \infty$.

Proof: See Appendix A. ■
4) Policy Structure: 

Lemma 4: $J_\xi(r, \gamma, \gamma_{max})$ is concave, increasing in $\xi$ and
increasing in $r$, $\gamma$, $\gamma_{max}$.

Proof: See Appendix A. ■

Theorem 2: Policy Structure: The conditions for optimal
relay placement are:

(i) If $\gamma \leq \gamma_{max}$, place the relay when $r \geq r_{th}(\gamma_{max})$ where
$r_{th}(\gamma_{max})$ is a threshold value.

(ii) If $\gamma > \gamma_{max}$, place the relay when $\gamma \leq \gamma_{th}(r, \gamma_{max})$ where
$\gamma_{th}(r, \gamma_{max})$ is a threshold value increasing in $r$ and $\gamma_{max}$.

Proof: See Appendix A. ■

Discussion of the policy structure: When $\gamma \le \gamma_{\max}$, we can postpone placement until the point beyond which the chance of getting a worse value of power becomes significant. For $\gamma > \gamma_{\max}$, waiting to place the relay may result in a better channel; there is a threshold $\gamma_{th}(r, \gamma_{\max})$ such that $\gamma_{th}(r, \gamma_{\max})$ may cross $\gamma_{\max}$ for large enough $r$. If $\gamma$ is between these two values then we place.
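The two-case rule of Theorem 2 can be sketched as a small decision function. This is only an illustration: the callables `r_th` and `gamma_th` are hypothetical stand-ins for the thresholds, which in practice would come from the function iteration of this section.

```python
def should_place(r, gamma, gamma_max, r_th, gamma_th):
    """Threshold rule of Theorem 2 (max-power objective, relaying via the last relay).

    r_th:     hypothetical callable, gamma_max -> distance threshold
    gamma_th: hypothetical callable, (r, gamma_max) -> power threshold
    """
    if gamma <= gamma_max:
        # Placing here would not raise the running maximum,
        # so place only once the distance threshold is reached.
        return r >= r_th(gamma_max)
    # Placing here would raise the maximum to gamma,
    # so accept only if gamma is below the power threshold.
    return gamma <= gamma_th(r, gamma_max)
```

For example, with constant thresholds `r_th = lambda gmax: 4` and `gamma_th = lambda r, gmax: 2.0`, the state $(r, \gamma, \gamma_{\max}) = (5, 1.0, 1.5)$ leads to placement, while $(2, 1.0, 1.5)$ does not.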



                 $\xi$ = 0.001    $\xi$ = 0.01    $\xi$ = 0.1
E(N)             18.1178          8.6875          4.6615
Relay Cost       0.01812          0.08688         0.46615
Power Cost       0.01524          0.04436         0.15079
$J_\xi(0)$       0.03336          0.13124         0.61693

TABLE II
Relaying via the last placed relay: break-up of the optimal cost for the example in Section III-B for three values of the relay cost $\xi$.
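The rows of Table II are tied together by the objective itself: the relay cost is $\xi E(N)$ and the optimal cost is the sum of the relay cost and the power cost. A short check of that arithmetic, using only the numbers printed in the table (the function name and tolerance are choices made for this sketch):

```python
def check_breakup(xi, mean_relays, relay_cost, power_cost, total_cost, tol=2e-5):
    """Return True if relay_cost ~ xi * E(N) and total ~ relay + power, up to rounding."""
    relay_ok = abs(xi * mean_relays - relay_cost) <= tol
    total_ok = abs(relay_cost + power_cost - total_cost) <= tol
    return relay_ok and total_ok

# Columns of Table II: (xi, E(N), relay cost, power cost, J_xi(0))
table_ii = [
    (0.001, 18.1178, 0.01812, 0.01524, 0.03336),
    (0.01,   8.6875, 0.08688, 0.04436, 0.13124),
    (0.1,    4.6615, 0.46615, 0.15079, 0.61693),
]
consistent = [check_breakup(*column) for column in table_ii]
```

All three columns pass the check within the rounding of the printed values.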



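5) Computation of the Optimal Policy: Let us define $V_\xi(r, \gamma_{\max}) := E J_\xi(r, \Gamma_r, \gamma_{\max})$. We can again argue that the following function iteration (similar to that used in Section III-A) will yield $V_\xi(r, \gamma_{\max})$ for all $r$, $\gamma_{\max}$, from which we can compute $r_{th}(\gamma_{\max})$ and $\gamma_{th}(r, \gamma_{\max})$:

$$V_\xi^{(k+1)}(r, \gamma_{\max}) = \sum_{\gamma} g(r, \gamma) \min\Big\{ \xi + \theta E \max\{\gamma, \gamma_{\max}, \Gamma_1\} + (1-\theta) V_\xi^{(k)}(1, \max\{\gamma, \gamma_{\max}\}),\ \theta E \max\{\gamma_{\max}, \Gamma_{r+1}\} + (1-\theta) V_\xi^{(k)}(r+1, \gamma_{\max}) \Big\} \tag{10}$$

with $V_\xi^{(0)}(r, \gamma_{\max}) = 0$ for all $r$, $\gamma_{\max}$.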
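A minimal sketch of the function iteration (10) on a toy model follows. Everything concrete here is an assumption made for illustration: the finite power set `S`, the pmf table `g` (row `r-1` holds the distribution of $\Gamma_r$), and in particular the rule that a relay is forced once $r$ reaches a maximum distance `R`, which is not stated in this section.

```python
import numpy as np

def function_iteration_max_power(S, g, xi, theta, R, iters=500):
    """Iterate (10). S: ascending positive power levels; g[r-1, i] = P(Gamma_r = S[i]).

    Assumption (not from the text): at r == R the only allowed action is 'place'.
    Returns (M, V) with V[r, j] approximating V_xi(r, M[j]); row 0 is unused.
    """
    S = np.asarray(S, dtype=float)
    g = np.asarray(g, dtype=float)
    M = np.concatenate(([0.0], S))      # possible gamma_max values (0 at the sink)
    V = np.zeros((R + 1, len(M)))       # V^(0) = 0
    for _ in range(iters):
        V_new = np.zeros_like(V)
        for r in range(1, R + 1):
            for j in range(len(M)):
                total = 0.0
                for i in range(len(S)):
                    m = max(i + 1, j)   # index in M of max{gamma, gamma_max}
                    place = (xi + theta * (g[0] @ np.maximum(S, M[m]))
                             + (1 - theta) * V[1, m])
                    if r < R:
                        skip = (theta * (g[r] @ np.maximum(S, M[j]))
                                + (1 - theta) * V[r + 1, j])
                        cost = min(place, skip)
                    else:
                        cost = place    # forced placement at the maximum distance
                    total += g[r - 1, i] * cost
                V_new[r, j] = total
        V = V_new
    return M, V
```

Once `V` has converged, the thresholds can be read off by comparing the "place" and "skip" terms for each $(r, \gamma, \gamma_{\max})$, as in inequality (19) of Appendix A.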

6) A Numerical Example: Figure 3 shows the variation of $r_{th}(\gamma_{\max})$ with $\gamma_{\max}$ and $\xi$. Here we consider the same setting as Section III-A6. The plot shows that $r_{th}(\gamma_{\max})$ increases with $\gamma_{\max}$. To get an insight into the reason, let us consider the situation $\gamma \le \gamma_{\max}$. If $r$ is small, then it is more likely that in the next step also the power required to establish a link to the last node will be below $\gamma_{\max}$, and hence we do not need to place a relay. But if $r$ is large, then it is more likely that the required power will cross $\gamma_{\max}$ in the next step, and hence we will have a threshold $r_{th}(\gamma_{\max})$ beyond which we have to place the relay. As $\gamma_{\max}$ increases, the probability that the power required to establish a link to the last node exceeds $\gamma_{\max}$ decreases for each $r$, thereby increasing $r_{th}(\gamma_{\max})$. Also, $r_{th}(\gamma_{\max})$ increases with $\xi$ because if the price of a relay increases, we will place relays less frequently.

Figure 4 shows the variation of $\gamma_{th}(r, \gamma_{\max})$ with $r$ for $\gamma_{\max} = -20$ dBm and two different values of $\xi$. For very small $\xi$ (e.g., $\xi = 0.00001$), $\gamma_{th}(r, \gamma_{\max})$ can be more than $\gamma_{\max}$ for all $r$. For moderate values of $\xi$ (e.g., $\xi = 0.01$ in Figure 3), the $\gamma_{th}(r, \gamma_{\max})$ vs. $r$ curve crosses $\gamma_{\max}$ at $r = r_{th}(\gamma_{\max})$. Also, we have seen numerically that for $\xi$ very large, $\gamma_{th}(r, \gamma_{\max})$ is 0 mW for $r \le 9$ steps; for large $\xi$ we will place a relay only when $r = 10$ steps (20 meters).

The variation of the mean number of relays $E(N)$ and the different cost components with $\xi$ is shown in Table II. It shows that as the cost of a relay $\xi$ increases, the mean number of relays decreases, and the power cost and $J_\xi(0)$ increase. Note that, for any given deployment and any given relay cost $\xi$, the sum power is at least as large as the max power in the network. Hence, for a given $\xi$, the mean power cost and $J_\xi(0)$ for the sum-power objective will be at least as large as the corresponding values for the max-power objective, as seen in Table I and Table II.

Fig. 3. Relaying only via last placed relay: $r_{th}(\gamma_{\max})$ vs. $\gamma_{\max}$, for various relay costs, $\xi$, for the numerical example described in Section III-B6.

Fig. 4. $\gamma_{th}(r, \gamma_{\max})$ vs. $r$ for $\gamma_{\max} = -20$ dBm and various relay costs, $\xi$, for the numerical example described in Section III-B6.

The estimated probabilities of deployment failure obtained from 200000 simulations of the deployment for $\xi$ = 0.001, 0.01, 0.1 and 1 are 0.01%, 0.145%, 0.73% and 5.39% respectively.

IV. Relaying via Any Previous Node 

In Section III we considered the case where, after the deployment is over, only the links between adjacent nodes are permitted, i.e., only the links represented by the solid lines in Figure 1 can be used. However, as discussed in Section II, while formulating the problem we need to take into account the fact that some relays might be skipped after deployment, i.e., some of the links represented by the dotted lines in Figure 1 can be used. This section is dedicated to such a formulation and to the exploration of the structural properties of the relay placement policies for different objectives.

A. Sum-Power Objective 

1) Problem Definition: Given a deployment of $N$ relays, indexed $1, 2, \dots, N$, consider the directed acyclic graph on these relays along with the sink (Node 0) and the source (Node $N+1$), whose links are all directed edges from each node to every node with smaller index. Hence, if $i$ and $j$ are two nodes with $i > j$, there is only one link $(i, j)$ between them. Consider all directed acyclic paths from the source to the sink on this graph. Let us denote by $p$ any arbitrary directed acyclic path from the source to the sink, and by $E(p)$ the set of (directed) links of the path $p$. We also define $\mathcal{P}_n := \{p : (i, j) \in E(p) \Rightarrow i > j,\ |i - j| \le n\}$, a subcollection of paths between the source and the sink on the directed acyclic graph, such that no path in $\mathcal{P}_n$ contains a link between two nodes whose indices differ by a number larger than $n$. We call $n$ the "memory" of the class of policies we are considering.

Here we consider the following problem:

$$\min_{\pi \in \Pi} E_\pi \Big[ \min_{p \in \mathcal{P}_n} \sum_{e \in E(p)} \Gamma^{(e)} + \xi N \Big] \tag{13}$$

where $\Gamma^{(e)}$ is the power used on the link $e$. We call $\sum_{e \in E(p)} \Gamma^{(e)}$ the "length" of the path $p$, and $\min_{p \in \mathcal{P}_n} \sum_{e \in E(p)} \Gamma^{(e)}$ the length of the "shortest path" from the source to the sink (over the relays deployed by policy $\pi$ in a given realization of the decision process).

$$J_\xi\big(\{y_k\}_{k=1}^{n}; \{P^{(k)}\}_{k=1}^{n}; \{\gamma^{(k)}\}_{k=1}^{n}\big) = \min\Big\{ \xi + \theta E \min\Big\{ \min_{k \in \{1,\dots,n-1\}} \big(\Gamma_{y_k+1} + P^{(k)}\big),\ \Gamma_1 + \min_{k \in \{1,\dots,n\}} \big(\gamma^{(k)} + P^{(k)}\big) \Big\} + (1-\theta) E J_\xi\Big(1, y_1+1, \dots, y_{n-1}+1;\ \min_{k \in \{1,\dots,n\}} \big(\gamma^{(k)} + P^{(k)}\big), P^{(1)}, \dots, P^{(n-1)};\ \Gamma_1, \Gamma_{y_1+1}, \dots, \Gamma_{y_{n-1}+1}\Big),\ \theta E \min_{k \in \{1,\dots,n\}} \big(\Gamma_{y_k+1} + P^{(k)}\big) + (1-\theta) E J_\xi\big(\{y_k+1\}_{k=1}^{n}; \{P^{(k)}\}_{k=1}^{n}; \{\Gamma_{y_k+1}\}_{k=1}^{n}\big) \Big\} \tag{11}$$

$$J_\xi\big(\{y_k\}_{k=1}^{m}; \{P^{(k)}\}_{k=1}^{m}; \{\gamma^{(k)}\}_{k=1}^{m}\big) = \min\Big\{ \xi + \theta E \min\Big\{ \min_{k \in \{1,\dots,m\}} \big(\Gamma_{y_k+1} + P^{(k)}\big),\ \Gamma_1 + \min_{k \in \{1,\dots,m\}} \big(\gamma^{(k)} + P^{(k)}\big) \Big\} + (1-\theta) E J_\xi\Big(1, y_1+1, \dots, y_m+1;\ \min_{k \in \{1,\dots,m\}} \big(\gamma^{(k)} + P^{(k)}\big), P^{(1)}, \dots, P^{(m)};\ \Gamma_1, \Gamma_{y_1+1}, \dots, \Gamma_{y_m+1}\Big),\ \theta E \min_{k \in \{1,\dots,m\}} \big(\Gamma_{y_k+1} + P^{(k)}\big) + (1-\theta) E J_\xi\big(\{y_k+1\}_{k=1}^{m}; \{P^{(k)}\}_{k=1}^{m}; \{\Gamma_{y_k+1}\}_{k=1}^{m}\big) \Big\} \tag{12}$$
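For a fixed deployment, the "shortest path" restricted to $\mathcal{P}_n$ can be computed by a simple recursion over node indices. The sketch below assumes a ragged list `power` in which `power[i][j]` is the power on link $(i, j)$ for $j < i$; this data layout and the function name are choices made for illustration. Passing `combine=max` gives the bottleneck ("max-power") variant.

```python
def shortest_path_lengths(power, n, combine=lambda link, rest: link + rest):
    """Lengths of the shortest paths to the sink when each link spans at most n indices.

    power[i][j]: power on link i -> j (j < i); node 0 is the sink and the
                 last node is the source.
    combine:     how a link cost and the remaining path length are combined
                 (sum by default; pass max for the max-power objective).
    Returns P with P[i] = length of the shortest path from node i to the sink.
    """
    num_nodes = len(power)
    P = [0.0] * num_nodes               # the sink is at distance 0 from itself
    for i in range(1, num_nodes):
        # Node i may link only to nodes i-1, ..., i-n (not below the sink).
        P[i] = min(combine(power[i][j], P[j]) for j in range(max(0, i - n), i))
    return P

# Sink 0, relays 1 and 2, source 3.
power = [[], [1.0], [5.0, 1.0], [10.0, 1.5, 1.0]]
sum_n1 = shortest_path_lengths(power, n=1)               # adjacent links only
sum_n2 = shortest_path_lengths(power, n=2)               # one relay may be skipped
max_n2 = shortest_path_lengths(power, n=2, combine=max)  # bottleneck path
```

In this toy instance the source-to-sink length drops from 3.0 with $n = 1$ to 2.5 with $n = 2$, because the source can skip relay 2.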

2) MDP Formulation: Consider the evolution of the network as the relays are deployed. Suppose that at some point in the deployment process there are $m$ preceding nodes, including the sink; see Figure 5 where $m = 4$. The transmit power required to establish a link from the current location to the $k$-th previous node is denoted by $\gamma^{(k)}$, and the distance of the current location from the $k$-th previous node is denoted by $y_k$. Let $P_n^{(k)}$ denote the length of the "shortest path" from the $k$-th previous node to the sink. We define $P_n^{(m)} := 0$ if $m \le n$, i.e., the length of the shortest path from the sink to itself is 0 (when $m \le n$, the $m$-th previous node is the sink). For notational simplicity, we drop the subscript and denote $P_n^{(k)}$ by $P^{(k)}$. The deployment operative decides whether to place a node at his current position based on (i) the powers $\gamma^{(1)}, \gamma^{(2)}, \dots, \gamma^{(n)}$, (ii) the distances $y_1, y_2, \dots, y_n$, and (iii) the lengths of the shortest paths $P^{(1)}, P^{(2)}, \dots, P^{(n)}$. If $n = 2$, at the "current location" shown in Figure 6 the decision will be based on the powers $\gamma^{(1)}$, $\gamma^{(2)}$, the distances $y_1$, $y_2$, and the shortest paths $P^{(1)}$ and $P^{(2)}$ at nodes 3 and 2 respectively. However, in case $m < n$, we do not have measurements for $n$ previous nodes. Hence, let us define $l_m := \min\{m, n\}$. At each step, the deployment operative knows the distances $\{y_k\}_{k=1}^{l_m}$, the powers $\{\gamma^{(k)}\}_{k=1}^{l_m}$ and the lengths of the shortest paths $\{P^{(k)}\}_{k=1}^{l_m}$. He decides based on this information whether to place a relay at the current position or not. We formulate this problem as an MDP with state $(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m}; \{\gamma^{(k)}\}_{k=1}^{l_m})$, and the action space {place, do not place}. The state at the sink is denoted by $\mathbf{0}$. Since the set $S$ of transmit power levels is countable, $\{P^{(k)}\}_{k=1}^{l_m}$ also take values from a countable set. Hence, the state space is countable in our problem.

Fig. 5. Measurement-based sequential relay placement. When standing at the "current location," the deployment operative, having already deployed Relays 1, 2, and 3, makes the power measurements $\gamma^{(k)}$, and knows the distances $y_k$.

Fig. 6. Sequential deployment of relay nodes with $n = 2$. When standing at the "current location," the deployment operative, having already deployed Relays 1, 2, and 3, makes the power measurements $\gamma^{(1)}$, $\gamma^{(2)}$, and knows the distances $y_1$, $y_2$ and the lengths $P^{(1)}$, $P^{(2)}$ of the shortest paths from relay 3 and relay 2 to the sink node.

If the state is $(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m}; \{\gamma^{(k)}\}_{k=1}^{l_m})$ and a relay is placed, the relay cost $\xi$ is incurred. The power cost is incurred only after the source is placed, and that cost will be the length of the shortest path in $\mathcal{P}_n$ from the source to the sink. Let us define $J_\xi(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m}; \{\gamma^{(k)}\}_{k=1}^{l_m})$ and $J_\xi(\mathbf{0})$ to be the optimal expected cost-to-go starting from state $(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m}; \{\gamma^{(k)}\}_{k=1}^{l_m})$ and state $\mathbf{0}$ respectively.

3) Bellman Equation: Note that here again we have an infinite horizon total cost MDP with a countable state space, finite action space and nonnegative single-stage cost. Hence, the optimal value function $J_\xi(\cdot)$ satisfies the Bellman equations (11) for $m \ge n$ and (12) for $m < n$. The first term in the $\min\{\cdot, \cdot\}$ is the cost if we place a relay at the state $(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m}; \{\gamma^{(k)}\}_{k=1}^{l_m})$, and the second term is the cost if we do not place a relay.



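$$J_\xi\big(\{y_k\}_{k=1}^{n}; \{P^{(k)}\}_{k=1}^{n}; \{\gamma^{(k)}\}_{k=1}^{n}\big) = \min\Big\{ \xi + \theta E \min\Big\{ \min_{k \in \{1,\dots,n-1\}} \max\{\Gamma_{y_k+1}, P^{(k)}\},\ \max\Big\{\Gamma_1, \min_{k \in \{1,\dots,n\}} \max\{\gamma^{(k)}, P^{(k)}\}\Big\} \Big\} + (1-\theta) E J_\xi\Big(1, y_1+1, \dots, y_{n-1}+1;\ \min_{k \in \{1,\dots,n\}} \max\{\gamma^{(k)}, P^{(k)}\}, P^{(1)}, \dots, P^{(n-1)};\ \Gamma_1, \Gamma_{y_1+1}, \dots, \Gamma_{y_{n-1}+1}\Big),\ \theta E \min_{k \in \{1,\dots,n\}} \max\{\Gamma_{y_k+1}, P^{(k)}\} + (1-\theta) E J_\xi\big(\{y_k+1\}_{k=1}^{n}; \{P^{(k)}\}_{k=1}^{n}; \{\Gamma_{y_k+1}\}_{k=1}^{n}\big) \Big\} \tag{14}$$

$$J_\xi\big(\{y_k\}_{k=1}^{m}; \{P^{(k)}\}_{k=1}^{m}; \{\gamma^{(k)}\}_{k=1}^{m}\big) = \min\Big\{ \xi + \theta E \min\Big\{ \min_{k \in \{1,\dots,m\}} \max\{\Gamma_{y_k+1}, P^{(k)}\},\ \max\Big\{\Gamma_1, \min_{k \in \{1,\dots,m\}} \max\{\gamma^{(k)}, P^{(k)}\}\Big\} \Big\} + (1-\theta) E J_\xi\Big(1, y_1+1, \dots, y_m+1;\ \min_{k \in \{1,\dots,m\}} \max\{\gamma^{(k)}, P^{(k)}\}, P^{(1)}, \dots, P^{(m)};\ \Gamma_1, \Gamma_{y_1+1}, \dots, \Gamma_{y_m+1}\Big),\ \theta E \min_{k \in \{1,\dots,m\}} \max\{\Gamma_{y_k+1}, P^{(k)}\} + (1-\theta) E J_\xi\big(\{y_k+1\}_{k=1}^{m}; \{P^{(k)}\}_{k=1}^{m}; \{\Gamma_{y_k+1}\}_{k=1}^{m}\big) \Big\} \tag{15}$$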



Observe that it is never optimal to place a relay at state $\mathbf{0}$ because, in doing so, a cost $\xi$ will unnecessarily be incurred. Hence, $J_\xi(\mathbf{0}) = \theta E(\Gamma_1) + (1-\theta) E J_\xi(1; 0; \Gamma_1)$.

When $m \ge n$, if we place a relay at the current location and if the line ends in the next step, the length of the shortest path from the source to the sink will be seen as a terminal cost, and is equal to $E \min\{\min_{k \in \{1,\dots,n-1\}} (\Gamma_{y_k+1} + P^{(k)}),\ \Gamma_1 + \min_{k \in \{1,\dots,n\}} (\gamma^{(k)} + P^{(k)})\}$. Note that in this case the shortest path from the source to the sink can pass via the relay placed at the "current location", or via one of the $(n-1)$ previous relays. For example, in the scenario shown in Figure 6 (with $n = 2$), if we place a relay at the "current location" and the line ends at the next step, then the neighbouring node of the source along the shortest path can be the relay placed at the "current location" or relay 3 (the source is not allowed to transmit directly to relay 2 because $n = 2$). Keeping this in mind, $\Gamma_{y_k+1} + P^{(k)}$ is the sum of two costs: the (random) power $\Gamma_{y_k+1}$ from the source to the $(k+1)$-st previous node w.r.t. the source (after placing the relay at the current location, the current $k$-th previous node will become the $(k+1)$-st previous node in the next step, where the source will be placed) and the length of the shortest path from that node to the sink. $\Gamma_1 + \min_{k \in \{1,\dots,n\}} (\gamma^{(k)} + P^{(k)})$ is the sum of the random power $\Gamma_1$ required to establish a link from the source to the relay deployed at the current location, and the length of the shortest path from this relay to the sink.

When $m \ge n$, if we place a relay at the current location and the line does not end in the next step, the terms $y_n$, $P^{(n)}$ and $\gamma^{(n)}$ disappear from the state (because a new relay has been placed, which must be taken into account in the state) and the distance 1 of the next location from the newly placed relay at the current location is absorbed into the state. Other distances in the state increase by 1 each. The length of the shortest path from the newly placed relay to the sink, i.e., $\min_{k \in \{1,\dots,n\}} (\gamma^{(k)} + P^{(k)})$, enters the state, and the powers required at the next location to connect to the $n$ previous relays (w.r.t. the next location) are independently sampled again. Hence, keeping in mind that $\Gamma_r$ is the random power required to establish a link between two nodes at a distance $r$, the new state becomes:

$$\Big(1, y_1+1, \dots, y_{n-1}+1;\ \min_{k \in \{1,\dots,n\}} \big(\gamma^{(k)} + P^{(k)}\big), P^{(1)}, \dots, P^{(n-1)};\ \Gamma_1, \Gamma_{y_1+1}, \dots, \Gamma_{y_{n-1}+1}\Big)$$



Similarly, if $m \ge n$ and we do not place a relay at the current location, in the next step the line may end with probability $\theta$ and may not end with probability $(1 - \theta)$. If the line ends, a cost of the shortest path $E \min_{k \in \{1,\dots,n\}} (\Gamma_{y_k+1} + P^{(k)})$ will be incurred. If the line does not end, the next state will be the random tuple $(\{y_k + 1\}_{k=1}^{n}; \{P^{(k)}\}_{k=1}^{n}; \{\gamma_k\}_{k=1}^{n})$, where for each $k \in \{1, \dots, n\}$, the $\gamma_k$ will be drawn independently of each other from the distribution $G_{y_k+1}(\cdot)$.³

Similar arguments can be used to explain (12) in case $m < n$. The difference is that if we place a relay at the current location and the line does not end in the next step, the next state will have three more terms, since the information for the newly placed relay can be accommodated into the state. On the other hand, if the line ends in the next step, the source will be able to communicate to the sink via one of the $m$ relays (there will be $(m + 1)$ preceding nodes, including the sink).
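The state transitions described above (for both $m \ge n$ and $m < n$, when the line does not end) can be sketched in a few lines. The list-based state layout, with index 0 holding the most recent node, and the callable `sample_power` (a stand-in for drawing $\Gamma_r$ from $G_r$) are assumptions made for this illustration.

```python
def next_state(y, P, gamma, n, place, sample_power):
    """One non-terminal transition of the memory-n sum-power MDP.

    y, P, gamma:  lists of length l_m = min(m, n); index 0 = most recent node.
    place:        True if a relay is placed at the current location.
    sample_power: hypothetical callable r -> a draw of Gamma_r.
    """
    if place:
        # Shortest path from the newly placed relay to the sink.
        new_P = min(g + p for g, p in zip(gamma, P))
        y_next = [1] + [d + 1 for d in y]
        P_next = [new_P] + list(P)
        # When m >= n the oldest node drops out; when m < n nothing is dropped,
        # so the state grows by three terms.
        y_next, P_next = y_next[:n], P_next[:n]
    else:
        y_next = [d + 1 for d in y]
        P_next = list(P)
    # Powers to the remembered nodes are sampled afresh at the next location.
    gamma_next = [sample_power(d) for d in y_next]
    return y_next, P_next, gamma_next
```

With $n = 2$, state `y=[2, 5]`, `P=[3.0, 1.0]`, `gamma=[1.0, 2.5]`, placing a relay yields distances `[1, 3]` and shortest-path lengths `[3.5, 3.0]`, since the new relay reaches the sink most cheaply through the second previous node.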

4) Results and Discussion: 

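Theorem 3: Policy Structure: For the state $(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m}; \{\gamma^{(k)}\}_{k=1}^{l_m})$, the optimal relay placement policy is the following:

Place a relay if and only if $\min_{k \in \{1,\dots,l_m\}} (\gamma^{(k)} + P^{(k)}) \le c(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m})$, where $c(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m})$ is a threshold value.

Proof: See Appendix B. ■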

Discussion of the Policy Structure: The structure of the optimal policy as stated in Theorem 3 is intuitive because here we need to check whether the quantity $\min_{k \in \{1,\dots,l_m\}} (\gamma^{(k)} + P^{(k)})$, which is the length of the shortest path from the current location of the deployment operative to the sink, is below a certain threshold.

Remarks:

• The optimal cost $J_\xi(\mathbf{0})$ of (13) is always less than or equal to that of the sum-power problem of Section III-A (relaying only via the last placed relay), if the relay price $\xi$ is the same in both cases. This is because each policy for $n = 1$ will be a policy for $n = 2$ as well.
• $n = \infty$ provides the best policy since there we consider information from all previous nodes.

3 It is to be noted that all the $\Gamma$ terms appearing in (11), (12), (14) and (15) are independent of each other.

Observation: With $n = 1$, $r := y_1$ and $\gamma := \gamma^{(1)}$, the Bellman equation (11) reduces to:

$$J_\xi\big(r; P^{(1)}; \gamma\big) = \min\Big\{ \xi + \theta E\big(\Gamma_1 + \gamma + P^{(1)}\big) + (1-\theta) E J_\xi\big(1; (\gamma + P^{(1)}); \Gamma_1\big),\ \theta E\big(\Gamma_{r+1} + P^{(1)}\big) + (1-\theta) E J_\xi\big(r+1; P^{(1)}; \Gamma_{r+1}\big) \Big\} \tag{16}$$

Note that $J_\xi(r; P^{(1)}; \gamma) = P^{(1)} + J_\xi(r; 0; \gamma)$. Let us denote $J_\xi(r; 0; \gamma) := J_\xi(r, \gamma)$. Now we can rewrite (16) as:

$$P^{(1)} + J_\xi(r, \gamma) = P^{(1)} + \min\Big\{ \xi + \gamma + \theta E(\Gamma_1) + (1-\theta) E J_\xi(1, \Gamma_1),\ \theta E(\Gamma_{r+1}) + (1-\theta) E J_\xi(r+1, \Gamma_{r+1}) \Big\} \tag{17}$$

Thus, we obtain the Bellman equation (3).

B. Max-Power Objective

Here we are going to address the following problem:

$$\min_{\pi \in \Pi} E_\pi \Big[ \min_{p \in \mathcal{P}_n} \max_{e \in E(p)} \Gamma^{(e)} + \xi N \Big] \tag{18}$$

We call $\max_{e \in E(p)} \Gamma^{(e)}$ the "length" of the path $p$, and $\min_{p \in \mathcal{P}_n} \max_{e \in E(p)} \Gamma^{(e)}$ the length of the "shortest path" from the source to the sink. Using notation and arguments similar to those used in problem (13), we can write the Bellman equations (14) and (15) and derive the structure of the optimal node placement policy:

Theorem 4: Policy Structure: For the state $(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m}; \{\gamma^{(k)}\}_{k=1}^{l_m})$, the optimal relay placement policy is the following:

Place a relay if and only if $\min_{k \in \{1,\dots,l_m\}} \max\{\gamma^{(k)}, P^{(k)}\} \le c(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m})$, where $c(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m})$ is a threshold value.

Remark: Note that the Bellman equation (6) can be derived from (14), with $n = 1$, $r := y_1$, $\gamma := \gamma^{(1)}$ and $P^{(1)} = \gamma_{\max}$.

C. Performance comparison between n = 1 and n = 2

We have made a comparative study of the performance of the optimal policies with memory 1 and the policies with memory 2. The results are shown in Table III. Here we have used the same model as used in Section III in the max-power case, but we have considered $S = \{0.1, 0.2, \dots, 2\}$ mW in the sum-power case in order to avoid a huge computational requirement. The study suggests that, for small relay cost, $n = 2$ can provide a significant percentage gain over the optimal cost for $n = 1$. Since at small $\xi$ we tend to place more relays (but the relay cost is small compared to $J_\xi(0)$, see Table I and Table II), skipping relays could be useful. For large $\xi$, we place very few relays, but the relay cost will dominate. As $\xi$ becomes very high, we will always place the relays periodically at every 10 steps, and nowhere else; hence the relay cost becomes independent of $n$. The small variation in power cost will be insignificant compared to the large relay cost.

TABLE III
Comparison of $J_\xi(0)$ for n = 1 and n = 2 for different $\xi$: sum-power objective and max-power objective. Two different transmit power sets are used for the two objectives.



D. Computational Issues 

The dimension of the state space is $3n$ (increasing in $n$) in the value iteration, and hence the computational complexity and memory requirement increase with $n$. However, for any arbitrary $n$, we can reduce the value iteration to a function iteration in the same way as in (5) and (10), and reduce the dimension of the domain of the function to $2n$ (instead of $3n$ in the value iteration).

V. Conclusion 

In this paper, we explored several sequential relay place- 
ment problems for as-you-go deployment of wireless relay 
networks, assuming very light traffic. The problems were 
formulated as MDPs, optimal policies were derived, and 
the procedure illustrated via numerical examples. There are 
numerous issues to improve upon: (i) the light traffic ("lone 
packet model") assumption, (ii) the assumption of independent 
shadow fading from link to link, and (iii) the deployment 
failure issue. Extension to positive traffic might require a 
different approach: perhaps one that requires a performance 
analysis model working in conjunction with an optimal se- 
quential decision technique. We are addressing these issues in 
our ongoing work. 

4 If the transmit power levels in mW are integer multiples of some basic 
power level, the lengths of the shortest paths will also be integer multiples of 
that basic power level. If the transmit power levels do not satisfy this property, 
the number of possible shortest paths can be very large, leading to enormous 
computational complexity. This case will not arise in the max-power case 
since, in that case, a shortest path will always take its values from the set S. 



Appendix A 

Only Links between adjacent nodes permitted 

Proof of Lemma 1: Here we have an infinite horizon total cost MDP with countable state space and finite action space. The assumption P of Chapter 3 in [9] is satisfied since the single-stage cost is nonnegative. Hence, by combining Proposition 3.1.5 and Proposition 3.1.6 of [9], we obtain the result.

Proof of Lemma 2: In the value iteration, $J_\xi^{(0)}(r, \gamma) := 0$ is concave, increasing in $\gamma$, $\xi$ and increasing in $r$, and $J_\xi^{(0)}(\mathbf{0}) := 0$ is concave, increasing in $\xi$. Suppose that $J_\xi^{(k)}(r, \gamma)$ is concave, increasing in $\gamma$, $\xi$ and increasing in $r$, and $J_\xi^{(k)}(\mathbf{0})$ is concave, increasing in $\xi$ for some $k \ge 0$. Note that $E(\Gamma_{r+1})$ is increasing in $r$.

Let us consider $r_1 > r_2$. We can write:

$$E J_\xi^{(k)}(r_1+1, \Gamma_{r_1+1}) = \sum_{\gamma} g(r_1+1, \gamma) J_\xi^{(k)}(r_1+1, \gamma) \ge \sum_{\gamma} g(r_1+1, \gamma) J_\xi^{(k)}(r_2+1, \gamma) \ge \sum_{\gamma} g(r_2+1, \gamma) J_\xi^{(k)}(r_2+1, \gamma) = E J_\xi^{(k)}(r_2+1, \Gamma_{r_2+1})$$

where the first inequality follows from the fact that $J_\xi^{(k)}(r+1, \gamma)$ is increasing in $r$ for each $\gamma$, and the second inequality follows from the facts that $J_\xi^{(k)}(r+1, \gamma)$ is increasing in $\gamma$ and $G_r(\cdot)$ is stochastically increasing in $r$. Hence, $E J_\xi^{(k)}(r+1, \Gamma_{r+1})$ is increasing in $r$. Hence, by the value iteration, $J_\xi^{(k+1)}(r, \gamma)$ is increasing in $r$.

We know that the minimum of two concave, increasing functions is concave, increasing. Note that each term in the $\min\{\cdot, \cdot\}$ of the value iteration is concave, increasing in $\gamma$, $\xi$. Hence, $J_\xi^{(k+1)}(r, \gamma)$ is concave, increasing in $\xi$, $\gamma$ and $J_\xi^{(k+1)}(\mathbf{0})$ is concave, increasing in $\xi$. Now, since $J_\xi^{(k)}(r, \gamma) \uparrow J_\xi(r, \gamma)$ for each $r$, $\gamma$ and $J_\xi^{(k)}(\mathbf{0}) \uparrow J_\xi(\mathbf{0})$ (by Lemma 1), the results follow.

Proof of Theorem 1: By Proposition 3.1.3 of [9], if there exists a stationary policy $\{\mu, \mu, \dots\}$ such that for each state $(r, \gamma)$, the action chosen by the policy is the action that achieves the minimum in the Bellman equation (3), then that stationary policy will be an optimal policy. Hence, it is clear that when the state is $(r, \gamma)$, it is optimal to place the relay if

$$\xi + \gamma + J_\xi(\mathbf{0}) \le \theta E(\Gamma_{r+1}) + (1-\theta) E J_\xi(r+1, \Gamma_{r+1})$$

or,

$$\gamma \le \theta E(\Gamma_{r+1}) + (1-\theta) E J_\xi(r+1, \Gamma_{r+1}) - \big(\xi + J_\xi(\mathbf{0})\big)$$

Thus, the condition for placing a relay when the state is $(r, \gamma)$ becomes $\gamma \le \gamma_{th}(r)$, where $\gamma_{th}(r)$ is a threshold value. Now, by stochastic monotonicity of $\Gamma_r$ in $r$, $E(\Gamma_{r+1})$ is increasing in $r$. Also, since $J_\xi(r, \gamma)$ is increasing in $r$, $\gamma$ and $\Gamma_r$ is stochastically increasing in $r$, $E J_\xi(r+1, \Gamma_{r+1})$ also is increasing in $r$. Hence, $\gamma_{th}(r)$ is increasing in $r$.
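The threshold expression in this proof can be evaluated numerically once $E J_\xi(r, \Gamma_r)$ is known. The sketch below iterates on $V(r) := E J_\xi(r, \Gamma_r)$ for a toy model and then reads off $\gamma_{th}(r)$. The power set `S`, the pmf table `g`, and the rule that a relay is forced at a maximum distance `R` are assumptions made for illustration; the iteration is written in the spirit of the function iterations mentioned in the paper and is not claimed to be the authors' exact procedure.

```python
import numpy as np

def sum_power_thresholds(S, g, xi, theta, R, iters=500):
    """Thresholds gamma_th(r) for the sum-power, last-relay problem (toy sketch).

    S: power levels; g[r-1, i] = P(Gamma_r = S[i]) for r = 1..R.
    Assumption (not from the text): a relay is forced once r == R.
    Returns (J0, th) with th[r-1] = gamma_th(r) for r = 1..R-1.
    """
    S = np.asarray(S, dtype=float)
    g = np.asarray(g, dtype=float)
    mean = g @ S                          # mean[r-1] = E(Gamma_r)
    V = np.zeros(R + 2)                   # V[r] ~ E J_xi(r, Gamma_r); V[0] unused
    for _ in range(iters):
        J0 = theta * mean[0] + (1 - theta) * V[1]
        V_new = np.zeros_like(V)
        for r in range(1, R + 1):
            place = xi + S + J0           # cost of placing, for each gamma in S
            if r < R:
                skip = theta * mean[r] + (1 - theta) * V[r + 1]
                cost = np.minimum(place, skip)
            else:
                cost = place              # forced placement at the maximum distance
            V_new[r] = g[r - 1] @ cost
        V = V_new
    J0 = theta * mean[0] + (1 - theta) * V[1]
    th = [theta * mean[r] + (1 - theta) * V[r + 1] - (xi + J0) for r in range(1, R)]
    return J0, th
```

When the rows of `g` are stochastically increasing in $r$, the returned thresholds are nondecreasing in $r$, in line with Theorem 1.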

Proof of Lemma 3: Here we have an infinite horizon total cost MDP with countable state space and finite action space. The assumption P of Chapter 3 in [9] is satisfied since the single-stage cost is nonnegative. Hence, by combining Proposition 3.1.5 and Proposition 3.1.6 of [9], we obtain the result.

Proof of Lemma 4: Note that, in the value iteration (7), $J_\xi^{(0)}(r, \gamma, \gamma_{\max}) := 0$ is increasing in $r$, $\gamma$ and $\gamma_{\max}$ and concave, increasing in $\xi$. Suppose that for some $k \ge 0$, $J_\xi^{(k)}(r, \gamma, \gamma_{\max})$ is increasing in $r$, $\gamma$, $\gamma_{\max}$ and concave, increasing in $\xi$. Since $J_\xi^{(k)}(r, \gamma, \gamma_{\max})$ is increasing in $r$, $\gamma$ and $G_r(\cdot)$ is stochastically increasing in $r$, $E J_\xi^{(k)}(r+1, \Gamma_{r+1}, \gamma_{\max})$ is increasing in $r$. $E J_\xi^{(k)}(r+1, \Gamma_{r+1}, \gamma_{\max})$ is also increasing in $\gamma_{\max}$, since $J_\xi^{(k)}(r, \gamma, \gamma_{\max})$ is increasing in $\gamma_{\max}$. $E \max\{\gamma_{\max}, \Gamma_{r+1}\}$ is increasing in $\gamma_{\max}$ and also increasing in $r$ since $G_r$ is stochastically increasing in $r$. On the other hand, the first term in the $\min\{\cdot, \cdot\}$ of (7) is independent of $r$, but increasing in $\gamma$, $\gamma_{\max}$ and linearly increasing in $\xi$. Now, the minimum of two increasing functions is increasing and the minimum of a linear function and a constant is concave. Hence, $J_\xi^{(k+1)}(r, \gamma, \gamma_{\max})$ is increasing in $r$, $\gamma$, $\gamma_{\max}$ and concave, increasing in $\xi$. Since $J_\xi^{(k)}(r, \gamma, \gamma_{\max}) \uparrow J_\xi(r, \gamma, \gamma_{\max})$, the results follow.

Proof of Theorem 2: By similar arguments as used in the proof of Theorem 1, the condition for placing a relay at a state $(r, \gamma, \gamma_{\max})$ is:

$$\xi + \theta E \max\{\gamma, \gamma_{\max}, \Gamma_1\} + (1-\theta) E J_\xi(1, \Gamma_1, \max\{\gamma, \gamma_{\max}\}) \le \theta E \max\{\gamma_{\max}, \Gamma_{r+1}\} + (1-\theta) E J_\xi(r+1, \Gamma_{r+1}, \gamma_{\max}) \tag{19}$$

Note that $E \max\{\gamma_{\max}, \Gamma_{r+1}\}$ increases in $r$. Also, $J_\xi(r+1, \gamma, \gamma_{\max})$ is increasing in $r$ and $\gamma$ for all $\gamma_{\max}$. Hence, $E J_\xi(r+1, \Gamma_{r+1}, \gamma_{\max})$ is increasing in $r$, since $G_r$ is stochastically increasing in $r$. Hence, the R.H.S. of (19) is increasing in $r$ and $\gamma_{\max}$. Now, if $\gamma \le \gamma_{\max}$, the L.H.S. of (19) is independent of $\gamma$. Hence, the condition for placing the relay is $r \ge r_{th}(\gamma_{\max})$, where $r_{th}(\gamma_{\max})$ is a threshold value. On the other hand, if $\gamma > \gamma_{\max}$, the L.H.S. of (19) is independent of $\gamma_{\max}$ and increasing in $\gamma$. Hence, we will place the relay if and only if $\gamma \le \gamma_{th}(r, \gamma_{\max})$, where $\gamma_{th}(r, \gamma_{\max})$ is a threshold value increasing in $r$ and $\gamma_{\max}$.

Appendix B 
Links between any pair of nodes permitted 

Proof of Theorem 3: By similar arguments using convergence of the value iteration as in the proof of Lemma 2, we can claim that $J_\xi(\{y_k\}_{k=1}^{l_m}; \{P^{(k)}\}_{k=1}^{l_m}; \{\gamma^{(k)}\}_{k=1}^{l_m})$ is increasing in each of its arguments. Now, the condition for placing a relay is that the first term in the $\min\{\cdot, \cdot\}$ of the R.H.S. of (11) or (12) is less than or equal to the second term. Since the first term is increasing in $\min_{k \in \{1,\dots,l_m\}} (\gamma^{(k)} + P^{(k)})$, the threshold nature of the optimal relay placement policy is evident.
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